Method and apparatus for image identification and comparison

ABSTRACT

A method and apparatus are provided for analyzing, identifying, and comparing images. The method can be used with any visually-displayed medium that is represented in any type of color space. An identified image can be authenticated, registered, marked, compared to another image, or recognized using the method and apparatus according to the present invention. At least one characteristic of an image's color space is selected and determined to generate a unique description of the image. This identification information is then used to compare different identified images to determine if they are identical according to a set of predetermined criteria. The predetermined criteria can be adjusted to permit the identification of images that are identical in part. In the preferred embodiment of the present invention, a software search application, such as a search engine or a spider, is used to locate and retrieve an image to be identified from an electronic network. A notification alarm is triggered when a duplicate image is located. In one embodiment, the present invention is implemented using a computer. One or more software applications, software modules, firmware, and hardware, or any combination thereof, are used to determine the identification information for the selected image characteristics, search for images, provide notification of identical images, and to generate a database of identified images.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional of U.S. patent application Ser. No. 09/271,112, filed Mar. 17, 1999, which claims priority from provisional application No. 60/078,878, filed Mar. 20, 1998. The subject matter of each of these applications is incorporated herein by reference. This application claims priority under 35 U.S.C. Section 120 from each of application Ser. Nos. 09/271,112 and 60/078,878.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates generally to image identification and, more specifically, to a computer-implemented method for analyzing, identifying, and comparing images.

[0004] 2. Description of Related Art

[0005] With the development of computers and electronic networks such as the Internet, it is now possible to create, represent, store, and view electronic representations of visually displayed images such as photographs, paintings, and prints. In addition to such electronic representations of “hard-copy” images, computer-generated art forms that are created, stored, and viewed exclusively as electronic representations are becoming more common.

[0006] Electronic representations, such as digital images, are extremely easy to duplicate. Unfortunately, it can be difficult or impossible to determine whether an electronic image is an original image, or is a duplicate of the original. Furthermore, the Internet has greatly facilitated the transmitting of duplicated images. This can be a significant problem for artists, copyright owners, and others who have interests in particular images.

[0007] Attempts have been made to mark electronic images to permit identification of unauthorized copies. For example, a digital watermark can be added to an electronic image. A suspected duplicate image can be identified by its hidden digital watermark. However, a digital watermark is located at one or more specific locations on an electronic image. Thus, if the portion of an image in which the watermark is stored is cropped upon duplication, image identification will not be possible using the digital watermark.

[0008] Furthermore, a digital watermark must be affirmatively added to an electronic image. Therefore, it is not possible to use this method to identify copies of images that either have not been digitally watermarked, or that were made prior to the addition of a digital watermark.

[0009] In addition, a digital watermark may not survive the transfer of an electronic image to printed format. For example, a duplicate digital image can be downloaded from the Internet and printed. The unauthorized print may not display the digital watermark.

[0010] It would be an advantage to provide a method and apparatus for identifying an image without requiring the use of an identifying mark. It would be a further advantage if such method and apparatus enabled the identification of altered duplicate images, such as cropped images. It would be yet another advantage if such method and apparatus were available to search an electronic network to locate, compare, and identify images.

SUMMARY AND OBJECTS OF THE INVENTION

[0011] The present invention is a computer-implemented method and apparatus for analyzing, identifying, and comparing images. The method can be used with any visually-displayed medium that is represented in any type of color space. An identified image can be authenticated, registered, marked, compared to another image, or recognized using the method and apparatus according to the present invention.

[0012] In the present invention, an image's displayed composition is parsed to generate unique image characteristics. At least one characteristic of the image's color space is selected and determined for a displayed image. In the preferred embodiment of the invention, the selected characteristics include color distribution, color space usage, color range distance, and image size. The information determined for each selected characteristic comprises a unique description of an image. This identification information can then be used to compare different identified images to determine if they are identical.

[0013] In the preferred embodiment of the present invention, a plurality of color values are combined to provide an expressed color value. In one embodiment, the color values are combined by grouping colors that cannot be distinguished by visual inspection. In an alternative embodiment, the color values are combined by truncating a specified number of the lower bits representing each color value and then by combining all color values whose remaining bits are equal in value.

[0014] A set of predetermined criteria is used to ascertain whether a second image is a duplicate of a first image. Such criteria can include the percentage of identity of the determined characteristics of the compared image. Thus, if the determined characteristics are identical within the predetermined percentage, the images will be considered to be duplicates. The predetermined criteria can be adjusted to permit the identification of images that are identical in part, such as a clipped copy of an image compared to an original.

[0015] In the preferred embodiment of the present invention, a software search application, such as a search engine or a spider, is used to retrieve an image from an electronic network. The retrieved image can then be identified using the method of the present invention. The software search application can be used to search an electronic network, such as the Internet, to seek out copies of an identified image. In one embodiment, a notification alarm is provided when a duplicate image is located.

[0016] In one embodiment, the present invention is implemented using a computer. In this embodiment, identification information for an image can be stored in a computer-accessible database. The computer can be adapted for communication with an electronic network such as the Internet. One or more software applications are used to determine the identification information for the selected image characteristics. Software applications are also used to compare images, provide notification of identical images, and to generate a database of identified images.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 is a block diagram of a computer network according to one embodiment of the present invention;

[0018] FIG. 2 is a block diagram illustrating an apparatus for accessing an electronic network, according to one embodiment of the present invention;

[0019] FIG. 3 is a flow chart of a method for identifying an image according to the present invention;

[0020] FIG. 4 is a flow chart illustrating the determination of an image's color distribution according to the preferred embodiment of the present invention;

[0021] FIG. 5 is a diagram illustrating the use of a spider to search an electronic network according to one embodiment of the present invention; and

[0022] FIG. 6 is a flow chart illustrating the use of a spider software application according to one embodiment of the present invention.

DETAILED DESCRIPTION

[0023] A method and apparatus for identifying images is described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be evident, however, to one skilled in the art that the present invention may be practiced without the specific details. In other instances, well-known structures and devices are shown in block diagram form to facilitate explanation. The description of preferred embodiments is not intended to limit the scope of the claims appended hereto.

[0024] The present invention is a method and apparatus for analyzing, identifying, and comparing images. The method can be used with any visually-displayed medium that is represented in any type of color space. The present invention can be used for purposes including but not limited to identifying a particular image, authenticating an image as being identical to a particular image, registering an image, for example with a registry, organization, database, or digital library, marking an image for subsequent identification, or identifying copies of a particular image.

[0025] In one embodiment, the present invention is implemented using a computer. Such computer can include but is not limited to a personal computer, network computer, network server computer, dummy terminal, local area network, wide area network, personal digital assistant, work station, minicomputer, and mainframe computer. The identification, search and/or comparison features of the present invention can be implemented as one or more software applications, software modules, firmware such as a programmable ROM or EEPROM, hardware such as an application-specific integrated circuit (“ASIC”), or any combination of the above.

[0026] FIG. 1 is a block diagram of a computer network system 100 according to one embodiment of the present invention. In computer network system 100, a network server computer 104 is connected to a network client computer 102 through a network 110. The network interface between server computer 104 and client computer 102 can also include one or more routers, such as routers 106 and 108. The routers serve to buffer and route the data transmitted between the server and client computers. Network 110 may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof. In one embodiment of the present invention, the server computer 104 is a World-Wide Web (WWW) server that stores data in the form of ‘web pages’ and transmits these pages as Hypertext Markup Language (HTML) files over the Internet network 110 to client computer 102. It should be noted that, although only one server and client computer each are illustrated in network system 100, a network that implements embodiments of the present invention may include a large number of interconnected client and server computers.

[0027] For example, one or more software applications accessible to a computer can be used to determine the identification information for the selected image characteristics. Software applications can also be used to compare images, provide notification of identical images, and to generate a database of identified images. Any or all of the software applications or hardware configurations of the present invention can be implemented by one skilled in the art using well known programming techniques and hardware components.

[0028] The original data source for the original reference image, and for any subsequently-identified comparison images, can be in any appropriate form, including but not limited to processed film (black-and-white, color, or negatives), video, CD, CD-ROM, photographs, optical disks, magazines, brochures, newspapers, books, paintings, and computer images. Computer image data sources can be stored in any format including but not limited to JPG, GIF, TIFF, PNG, PCX, MacPaint, GEM, IFF/ILBM, Targa, Microsoft Windows Device Independent Bitmap, WordPerfect Graphics, Sun Raster files, PBM, X Windows bitmaps, FITS, DXF, HPGL, Lotus PIC, UNIX plot format, PCL, Basic PostScript graphics, WMF, PICT, CGM, RIB, FLI/FLC, MPEG, QuickTime animations, Kodak ICC, PDS, RIFF, SGI, XPM, HP Paintjet, PC Paint, Utah RLE, and VICAR.

[0029] In one embodiment, the present invention is a vendor-provided service, with the image identification, search, and any image comparisons performed by the vendor for use by users or customers. In this embodiment, the software applications, firmware, and hardware for implementing the invention reside with the vendor. A user can electronically access information previously obtained by the vendor, can request that a search be performed for specific information, and can provide an image for comparison with a database, file, library of stored images, or any other image. In one embodiment, the user provides an image for identification by the vendor. The vendor stores the identification information for this image and searches, for example on the Internet, for duplicate images. The user is notified when a duplicate image is located.

[0030] In alternative embodiments of the present invention, the entire process and apparatus or any portion thereof can reside with one or more users or third parties. In such embodiments, the present invention can be implemented as one or more software applications, software modules, firmware, and hardware that are provided to individual users for their utilization.

[0031] The computer can be adapted for communication with an electronic network such as the Internet. As a result, the method according to the present invention can be used to identify images stored on an electronic network, such as images displayed on a World Wide Web (“Web”) page. FIG. 2 is a block diagram illustrating an apparatus 200 for accessing an electronic network, according to one embodiment of the present invention. In this embodiment, a computer 202 is adapted for communication with an electronic network 204 such as the Internet. Identified reference images 206 can be stored on the computer, for example, in a database. A spider or search engine 208, also resident on the computer, can be used to search the electronic network for images. The identification of retrieved images can then be performed using a software application 210 resident on the computer.

[0032] In an alternative embodiment of the present invention, the image identification is performed manually. Similarly, the comparison of an identified image with another image can be performed either manually, or by using a computer.

[0033] In the present invention, an image's displayed composition is parsed to generate unique image characteristics. At least one characteristic of the image's color space is selected and determined for a displayed image. The information determined for each selected characteristic comprises a unique description of an image. This identification information can then be used to compare different identified images to determine if they are identical.

[0034] In the preferred embodiment of the invention, the selected characteristics include color distribution, color space usage, color range distance, and image size. The selected characteristics can be determined in any order. In alternative embodiments, any or all of the characteristics of a color space can be used to identify an image according to the present invention. For purposes of explaining the present invention, the examples described herein use the RGB color space. However, any suitable display type or color space type can be used, including but not limited to RGB, YIQ, YUV, YDbDr, and YCbCr.

[0035] The RGB color space is an “additive” color system. In the RGB color space, all colors are represented according to the values of the red, green, and blue components required to produce each color. Each of the three component colors is divided into 256 digital steps, numbered 0 through 255. Therefore, black is represented as 0,0,0 and white, which contains the maximum amount of all three colors, is 255,255,255.

[0036] FIG. 3 is a flow chart of a method for identifying an image according to the present invention. The image data is obtained from the data source and, if necessary, is converted to the color space being used in the identification procedure 300. For example, a JPG file format image can be converted to the RGB color space.

[0037] Color ranges for the particular color space are then defined by dividing the total color range for the particular color space into a specific number of groups 302. Image colors used in the image are assigned to their defined color ranges and color usage counts for each color range are recorded 304. The average farthest distance between color points in each color range is then derived 306. The image's display width and height for the current display medium are also determined 308.

[0038] The image is identified by taking the average of the color range distribution, color usage counts, and color distances, and by making each of the averaged characteristics relative to 100% of all characteristic ranges 310.

[0039] FIG. 4 is a flow chart illustrating the determination of an image's color distribution according to the preferred embodiment of the present invention. In a first step, all possible display ranges for a media type pixel are defined 400. For an RGB color space, the display range is from 0,0,0 through 255,255,255.

[0040] Each display range is divided into N groups, where N represents the total number of discrete elements desired 402. N can be any value from 1 to the maximum number of color values in a display range. For example, in an RGB color space, if N=32, then the first of the 32 groups is from 0,0,0 through 7,255,255. The second group is from 8,0,0 through 15,255,255. The third group is from 16,0,0 through 23,255,255. The remaining groups are similarly determined, with the last group being from 248,0,0 through 255,255,255. The image Color Distribution can be generated using methods including but not limited to one or more software applications, a calculator, or by hand calculation. Range sizes can differ among the selected characteristics in an image identification procedure according to the present invention.
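The division of the RGB display range into N groups can be illustrated with a minimal Python sketch. It assumes that colors are ordered with the red component most significant and that the ordered range is cut into N equal slices, which reproduces the boundaries 0,0,0 through 7,255,255 and 8,0,0 through 15,255,255 given for N=32. The function names range_index and range_bounds are hypothetical and do not appear in the specification.

```python
TOTAL_COLORS = 256 ** 3  # number of distinct colors in a 24-bit RGB color space


def range_index(color, n_groups=32):
    """Return the 0-based index of the color range containing an RGB color.

    Assumption: colors are ordered by (red, green, blue) and the ordered
    range 0,0,0 .. 255,255,255 is cut into n_groups equal slices.
    """
    r, g, b = color
    packed = (r << 16) | (g << 8) | b  # position in the ordered color range
    return packed * n_groups // TOTAL_COLORS


def range_bounds(index, n_groups=32):
    """Return the first and last RGB colors (inclusive) of one color range."""
    # Ceiling divisions give the smallest packed value in this and the next range.
    lo = -((-index * TOTAL_COLORS) // n_groups)
    hi = -((-(index + 1) * TOTAL_COLORS) // n_groups) - 1

    def unpack(p):
        return (p >> 16, (p >> 8) & 0xFF, p & 0xFF)

    return unpack(lo), unpack(hi)
```

Under these assumptions, range_bounds(0) returns ((0, 0, 0), (7, 255, 255)) and range_bounds(1) returns ((8, 0, 0), (15, 255, 255)), so each of the 32 groups spans eight consecutive red values.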

[0041] The color value is then determined for each pixel in the image 404. In the preferred embodiment of the present invention, a plurality of color values are combined to provide an expressed color value 406. In an alternative embodiment, however, no such color values are combined. The color values can be combined in several different manners. In one embodiment, the color values are combined by grouping colors that cannot be distinguished by visual inspection 408. For example, color 233,233,233 can be considered the same color as 233,233,232 or 233,232,234 for purposes of the present invention. In this example, the three visually indistinguishable colors are considered to be the same color and are counted as one instead of three colors in the color space range. This method is used to advantage in averaging out display media differences when comparing screen-captured images.

[0042] In an alternative embodiment, the color values are combined by truncating a specified number of the lower bits representing each component color value 410. The truncated component color values are re-calculated. After the truncation step, all colors having the same component color values are then combined 412. For example, the binary representation of the number 255 is 11111111. The RGB color space color 233,233,233 would therefore be represented as:

[0043] 11101001,11101001,11101001.

[0044] The last four bits of each color component's value (1001) are truncated to produce the color value:

[0045] 11100000,11100000,11100000.

[0046] Similarly, the RGB color 236,236,236 is represented as:

[0047] 11101100,11101100,11101100.

[0048] The last four bits of each component color's value can be truncated to also produce the color value:

[0049] 11100000,11100000,11100000.

[0050] Therefore, in this preferred embodiment, the RGB colors 233,233,233 and 236,236,236 will be considered to be the same color. This method is advantageous because it allows a color table of 4096 elements (16 × 16 × 16 truncated component values) to be created, stored, and directly indexed in memory.
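The truncation step and the resulting 4096-element table index can be illustrated with a minimal Python sketch. The function names truncate_color and table_index, and the layout of the index (red bits highest, blue bits lowest), are assumptions made for illustration only.

```python
def truncate_color(color, drop_bits=4):
    """Zero the lowest drop_bits bits of each 8-bit component of an RGB color."""
    mask = 0xFF & ~((1 << drop_bits) - 1)  # drop_bits=4 gives 0b11110000
    return tuple(component & mask for component in color)


def table_index(color, drop_bits=4):
    """Map an RGB color to a direct index into a table of expressed colors.

    With drop_bits=4, each component keeps 4 bits, so the index has
    12 bits and the table has 2**12 = 4096 entries.
    """
    keep = 8 - drop_bits
    r, g, b = (component >> drop_bits for component in color)
    return (r << (2 * keep)) | (g << keep) | b
```

With these definitions, truncate_color((233, 233, 233)) and truncate_color((236, 236, 236)) both return (224, 224, 224), which is 11100000,11100000,11100000 in binary, and both colors map to the same table index.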

[0051] The number of expressed color values in each group of N elements for which there is at least one pixel with a corresponding color value is then determined 414. As an example, if the image included one pixel having the value 1,244,244, then the range 0,0,0 through 7,255,255 would have at a minimum one of its color space values used in the image. Only one color space element is considered to be used when the image includes a plurality of pixels whose color values are combined and considered to be the same color, as discussed previously.

[0052] Once all color space range elements used by pixels in the image are determined, the number of color space range elements used in each color space range can optionally be divided by the total number of color space range elements used in all the color space ranges 416. This will generate the percentage of elements used in each of the color space ranges.

[0053] In the following example, the total number of color ranges is two, and a total of three color elements are used:

Range:               0,0,0 to 127,255,255     128,0,0 to 255,255,255
Colors Used:         1                        2
Total Colors Used:   1 + 2 = 3
Range %:             1/3 = 33%                2/3 = 66%

[0054] In the previous example, the color space distribution is 33%, 66% for a two range color space. In alternative embodiments of the present invention, the range elements are represented by other methods including but not limited to averaging, and calculating the deviation from a specific point.
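The color distribution calculation can be illustrated with a minimal Python sketch under stated assumptions: expressed colors are formed by the bit-truncation alternative (drop_bits=0 means no combining), and color ranges are equal slices of the red-major ordered RGB range. The function name color_distribution and its parameters are hypothetical.

```python
from collections import defaultdict


def color_distribution(pixels, n_groups=2, drop_bits=0):
    """Return, per color range, the percentage of all used color elements.

    pixels is an iterable of (r, g, b) tuples. Pixels whose expressed
    colors are equal count as a single used color element.
    """
    total_colors = 256 ** 3
    used = defaultdict(set)  # range index -> set of expressed colors seen
    for r, g, b in pixels:
        expressed = tuple((c >> drop_bits) << drop_bits for c in (r, g, b))
        packed = (expressed[0] << 16) | (expressed[1] << 8) | expressed[2]
        used[packed * n_groups // total_colors].add(expressed)

    counts = [len(used.get(i, ())) for i in range(n_groups)]
    total_used = sum(counts)
    if total_used == 0:
        return [0.0] * n_groups
    return [100.0 * count / total_used for count in counts]
```

For an image with one distinct color in the range 0,0,0 to 127,255,255 and two distinct colors in the range 128,0,0 to 255,255,255, the function returns one third and two thirds of 100, regardless of how many pixels share each color.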

[0055] The image's color space usage is determined by, for each color range defined in the image's color space, counting the number of pixels that use a color element in the color range. Once all such pixel usage counts have been done, each color space range total pixel count can optionally be divided by the total number of pixels used in the image. This generates the percentage of usage for each specific color range of the color space. Other representational methods such as averaging or calculating deviation from a specific point can also optionally be used.

[0056] In the following example, the total number of color ranges is two, and the image has four pixels. Three pixels use colors defined in the first color range and one pixel uses a color defined in the second color range.

Range:                       0,0,0 to 127,255,255     128,0,0 to 255,255,255
Color Element Usage Count:   3                        1
Total Usage Count:           3 + 1 = 4
Usage %:                     3/4 = 75%                1/4 = 25%
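The color space usage calculation can be illustrated with a minimal Python sketch. It assumes the same equal, red-major slicing of the RGB range into color ranges; the function name color_space_usage is hypothetical.

```python
def color_space_usage(pixels, n_groups=2):
    """Return, per color range, the percentage of the image's pixels in it.

    pixels is a sequence of (r, g, b) tuples; every pixel is counted,
    including pixels that repeat a color.
    """
    total_colors = 256 ** 3
    counts = [0] * n_groups
    for r, g, b in pixels:
        packed = (r << 16) | (g << 8) | b
        counts[packed * n_groups // total_colors] += 1

    n_pixels = len(pixels)
    if n_pixels == 0:
        return [0.0] * n_groups
    return [100.0 * count / n_pixels for count in counts]
```

For a four-pixel image with three pixels in the first range and one pixel in the second, the function returns [75.0, 25.0].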

[0057] Color range distance is determined by finding the distance between the two farthest points of each color element defined in each color range. For example, distance can be defined as follows:

Point 1: X₁=20, Y₁=10

Point 2: X₂=30, Y₂=15

Distance=|X₂−X₁|×|Y₂−Y₁|=|30−20|×|15−10|=50

[0058] The total distances for each color range are then averaged. The color range distance can be represented by any other methods such as deviation from a specific point. An example of color range distancing using averaging is as follows:

Color Range Distance=(CRE₁+CRE₂+ . . . +CRE_N)÷Total CREs for a range

[0059] where CRE=Color Range Element.
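The color range distance calculation can be illustrated with a minimal Python sketch. It is one possible reading of the description: the "points" of a color element are taken to be the pixel positions at which that color occurs, the distance between two points is the product of the coordinate differences as in the example, and color ranges are equal red-major slices of the RGB range. The function names distance and color_range_distance are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def distance(p1, p2):
    """Distance as defined in the example: |X2 - X1| x |Y2 - Y1|."""
    return abs(p2[0] - p1[0]) * abs(p2[1] - p1[1])


def color_range_distance(pixels, n_groups=2):
    """Average, per color range, of each color element's farthest distance.

    pixels is an iterable of ((x, y), (r, g, b)) pairs. Every pair of
    positions is compared, which is adequate for a small illustration.
    """
    total_colors = 256 ** 3
    positions = defaultdict(list)  # color element -> positions where it occurs
    for point, color in pixels:
        positions[color].append(point)

    per_range = defaultdict(list)  # range index -> one distance per element
    for (red, green, blue), points in positions.items():
        farthest = max(
            (distance(a, b) for a, b in combinations(points, 2)), default=0
        )
        packed = (red << 16) | (green << 8) | blue
        per_range[packed * n_groups // total_colors].append(farthest)

    return [
        sum(per_range[i]) / len(per_range[i]) if i in per_range else 0.0
        for i in range(n_groups)
    ]
```

Here distance((20, 10), (30, 15)) evaluates to 50, matching the example, and a range containing color elements with farthest distances 50 and 6 averages to 28.0.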

[0060] Image size is derived by determining the width and height of an image. In one embodiment of the present invention, the width of an image is defined as the number of color space units used from right-to-left of the image, and the height of an image is the number of color space units used from the top-to-bottom of the image. It is readily apparent to one skilled in the art that the directions of measurement and the size of the color space units can be varied without departing from the scope and spirit of the present invention.

[0061] Once the selected characteristics of an image, such as the color distribution, color space usage, color range distance, and image size according to the preferred embodiment of the present invention, are determined, this data constitutes identification information for the image. Depending on the number and type of selected characteristics, this identification information can uniquely identify the image.

[0062] In one embodiment of the present invention, the image identification information is used to identify copies of a reference image. The identification of such copies can be performed using methods including but not limited to one or more software applications, a calculator, or by hand calculation.

[0063] The identification information can be used to authenticate an image. For example, a work of computer art can be authenticated by generating the identification information according to the present invention. This authentication does not require the use of a prior art identification marker implanted within or associated with the image, such as a digital watermark. As a result, copies of an image authenticated according to the present invention can be readily identified even if the digital watermark has been cropped from the copied image.

[0064] The identification information according to the present invention can be used to authenticate, catalog, index, retrieve, identify, and register an image or images. In addition, the identification information can also be used to search for image copies including but not limited to reproductions, screen captures, and cropped areas. In one embodiment of the present invention, identification information for a reference and/or comparison image is stored in a computer-accessible database.

[0065] In the preferred embodiment of the present invention, the image identification and/or search is conducted for images stored on an electronic network, such as the Internet. However, in alternative embodiments, the teachings of the present invention can equally be applied to images stored on any type of storage or electronic storage medium, network, or system, including but not limited to CD-ROMs, Digital Video Disks, billboards, films, videos, photographs, posters, newspapers, books, and magazines. For example, a photograph can be electronically scanned and analyzed to determine its selected characteristics. The identification information thus generated for the photograph can be used to identify digital copies of the photograph that are stored on the Internet, or hard copies of the photograph on posters.

[0066] In the preferred embodiment of the present invention, a software search application, such as a search engine or a spider, is used to retrieve an image from an electronic network. For the purposes of this application, an Internet spider is a software application running on a node on a network. The spider software application is programmed to access other hosts' websites on the Internet and retrieve reference information from the HTML pages and images found on the visited sites. The retrieved data is loaded and stored on at least one database. This database can be on the same computer as the spider software application, or on one or more other computers.

[0067] For the purposes of this application, a search engine is a software application that is programmed to use the retrieved data (reference information) stored in the database by the spider. The search engine locates websites that contain requested information and images that are based upon the stored reference information collected by the spider.

[0068] The teachings of the present invention can be implemented using either a proprietary or a commercially-available spider or search engine. Such commercially-available spider software applications or search engines include but are not limited to America On-Line's Web Crawler, Compaq Corporation's Alta Vista, Yahoo! Corporation's Yahoo!, InfoSeek Corporation's InfoSeek, Lycos Corporation's Lycos, and @Home Corporation's Excite. Any other search software application or other searching technique known to one skilled in the art can also be used.

[0069] An image retrieved using such spider or search engine can then be identified using the method of the present invention. The spider or search engine can be used to search an electronic network, such as the Internet, to seek out copies of an identified image. FIG. 5 is a diagram illustrating the use of a spider to search an electronic network according to one embodiment of the present invention.

[0070] In FIG. 5, a spider 502 according to the present invention is in communication with a database 504 that contains image identification information, also according to the present invention. The spider is also in communication with an electronic network, such as the Internet 500.

[0071] The spider is programmed to access different sites on the Internet 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526. These sites can be selected by any appropriate means as described below in further detail. Images located by the spider can be retrieved and added to a database 528, identified and compared to image identification information from the database 504. Duplicate, copied, cropped, and transformed versions of the reference image can thereby be located and identified. The database 528 in which the spider stores retrieved reference information can be the same database as the image identification information database 504, or can be a separate database as illustrated in FIG. 5.

[0072] FIG. 6 is a flow chart illustrating the use of a spider software application according to one embodiment of the present invention. In the Figure, the reference image identification information is provided to the spider 600. The search parameters, such as the Internet sites that the spider is to search, are also provided to the spider 602. The spider then accesses and searches the selected search locations 604. The spider can be configured to, for example, locate and identify each image available on a visited site. The image identification can be performed either by the spider software application itself, or by another software application resident upon a computer to which the spider transmits a located image.

[0073] For each located image that is identified, the located image identification information is stored in a database 606. This located image identification information is compared to reference image identification information 608 for one or more images that is stored in the same or in a separate database.
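The flow of steps 600 through 608 can be illustrated with a minimal Python sketch. Network access, image identification, and the comparison rule are passed in as functions so that the sketch stays self-contained; run_spider and all of its parameter names are hypothetical, and an in-memory dictionary stands in for the database.

```python
def run_spider(reference_info, sites, list_images, identify, is_match):
    """Visit each site, identify each located image, and compare it.

    reference_info: dict of reference name -> identification information (600)
    sites:          iterable of search locations (602)
    list_images:    function(site) -> iterable of (location, image) pairs
    identify:       function(image) -> identification information
    is_match:       function(reference, located) -> bool
    """
    located_db = {}  # stands in for the database of located images
    matches = []     # (reference name, location) pairs that compare as equal
    for site in sites:                                    # step 604
        for location, image in list_images(site):
            info = identify(image)
            located_db[location] = info                   # step 606
            for name, reference in reference_info.items():  # step 608
                if is_match(reference, info):
                    matches.append((name, location))
    return located_db, matches
```

Keeping identification and comparison as separate arguments mirrors the statement that identification can be performed by the spider itself or by another software application.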

[0074] A report indicating a possible duplicate image is generated for each located image whose identification information matches within a selected percentage of the reference image identification data 610. A secondary comparison can be performed on any such possible duplicate images located 612. For example, a possible duplicate image can be reloaded to the computer and compared to the reference image using pattern matching, quadrant frequency, usage counts, or any other applicable method. The results of the secondary comparison can then be reported.
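The specification does not define how a match "within a selected percentage" is computed, so the following Python sketch uses an assumed rule for illustration only: identification information is a list of percentage-valued characteristics, and the percentage of identity is 100 minus the mean absolute difference between corresponding values. The names percent_identity and is_possible_duplicate, and the default threshold of 90, are hypothetical.

```python
def percent_identity(reference, candidate):
    """Assumed similarity measure: 100 minus the mean absolute difference.

    Both arguments are equal-length sequences of percentage values,
    such as a color distribution or a color space usage vector.
    """
    if len(reference) != len(candidate):
        raise ValueError("identification vectors must have the same length")
    if not reference:
        return 100.0
    diffs = [abs(a - b) for a, b in zip(reference, candidate)]
    return max(0.0, 100.0 - sum(diffs) / len(diffs))


def is_possible_duplicate(reference, candidate, threshold=90.0):
    """Report a possible duplicate when identity meets the selected percentage."""
    return percent_identity(reference, candidate) >= threshold
```

Under this assumed rule, usage vectors [75.0, 25.0] and [70.0, 30.0] have a 95.0 percent identity and would be reported at a threshold of 90, while [75.0, 25.0] and [25.0, 75.0] would not.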

[0075] In one embodiment, the spider or search engine is provided with an alarm or notification feature. Such features can include notifying an operator that an image match has occurred, notifying another party that an image match has occurred, and notifying the addressee of a particular site that an image(s) on that site matches an image(s) on another site. An alarm or notification can be visually displayed by using, for example, a text message, flashing display, color display, different font type or size, shading, borders, graying out, highlighting, animation, audio display, sound alarm, audibly broadcast message, and printed notice.

[0076] An alarm or notification can be stored for later retrieval, configured to display at particular times, or conditioned upon the occurrence of particular events. For example, the notification can be triggered to display every ten minutes, every time an image match is found, every time ten image matches are found, when no image match is found, to identify the total number of images on a site or electronic network, and to identify the total number or percentage of matching images or sites having matching images.
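One of the conditions described above, a notification triggered every time a set number of image matches is found and stored for later retrieval, might be sketched as follows. The class `MatchNotifier` and its attribute names are hypothetical and are used here only for illustration.

```python
from typing import List, Optional


class MatchNotifier:
    """Emit a text notification once every `every_n_matches` recorded matches."""

    def __init__(self, every_n_matches: int = 1) -> None:
        if every_n_matches < 1:
            raise ValueError("every_n_matches must be at least 1")
        self.every_n_matches = every_n_matches
        self.pending: List[str] = []  # matches not yet reported
        self.stored: List[str] = []   # notifications kept for later retrieval

    def record_match(self, site: str) -> Optional[str]:
        """Record one match; return a message when the trigger count is reached."""
        self.pending.append(site)
        if len(self.pending) < self.every_n_matches:
            return None
        message = f"{len(self.pending)} image match(es): " + ", ".join(self.pending)
        self.stored.append(message)
        self.pending.clear()
        return message


notifier = MatchNotifier(every_n_matches=2)
first = notifier.record_match("site-a")   # below the trigger count
second = notifier.record_match("site-b")  # trigger count reached
```

Setting `every_n_matches` to 1 corresponds to a notification every time an image match is found; 10 corresponds to every ten matches.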

[0077] A search for duplicate images can be performed at the direction of a user, or can be performed automatically. For example, the user can have a particular image identified and compared to an authenticated image to determine if the images are identical. Alternatively, the search engine or spider can be configured to search for and to determine the selected characteristics of an image or group of images. The spider or search engine can be programmed to locate all images at a particular site, locate images and identify only specific images, locate and identify all images at a particular site, compare located images with a predetermined identified image, and compare located images with each other to identify sites containing identical images.

[0078] An example illustrating one embodiment of the present invention is hereby provided as Example 1. In Example 1, an image in the RGB color space is identified and compared to another image to determine if the two images are duplicates. A duplicate image according to the present invention maintains a certain percentage of identity with the reference image. This percentage can vary according to the method of duplication, or according to whether the image was duplicated in its entirety.

[0079] According to one embodiment of the present invention, the closer the selected characteristics of the suspected duplicate image are to those of the reference image, the greater the amount of duplication of the two images. For example, a direct copy of an image file to another file of the same type and storage specifications would approach a 100% match of selected characteristics. However, a copy of an image file from, for example, the JPG format to the GIF format, a cropped copy, or a copy saved as a smaller JPG file could alter the selected characteristics. Therefore, the percentage of identity between the selected characteristics of the suspected duplicate image and the reference image would be less than 100%.

[0080] In one embodiment of the present invention, a set of predetermined criteria is used to ascertain whether a second image is a duplicate of a first image. Such criteria can include the percentage of identity of the determined characteristics of the compared images. Thus, if the determined characteristics are identical within the predetermined percentage, the images will be considered to be duplicates. The predetermined criteria can be adjusted to permit the identification of images that are identical in part, such as a clipped copy of an image compared to an original.

[0081] As an example, a comparison of two images to determine 100% identity of selected characteristics could be used to identify a direct copy of an image file to another file of the same type and storage specifications, as described above. Such a comparison might not identify an image copied to another format in which certain image characteristics are altered. However, a comparison to determine 80% identity of selected characteristics might be sufficient to identify such a duplicate image stored in a different format.

[0082] In Example 1, four characteristics of the image composition are selected: Color Distribution, Color Space Usage, Color Range Distance, and Image Size. One skilled in the art will recognize that other amounts and types of image characteristics could be used to identify an image according to the present invention. For example, Image Depth could be a selected characteristic for a three-dimensional rendering, and Image Size might not be selected for standard image size databases. The steps performed in Example 1 can be performed in any suitable order. Steps can be combined and additional steps can be added to accomplish the image identification according to the present invention.

EXAMPLE 1

[0083] Image1=Reference image

[0084] Image2=Comparison image

[0085] I. Image Identification:

[0086] Number of Ranges for:

[0087] Color Distribution=32

[0088] Color Space Usage=32

[0089] Color Range Distance=32

[0090] Step 1: All colors used in the reference image are assigned to one of 32 different ranges in the RGB color space to generate the image Color Distribution, starting with white 255,255,255 and ending with black 0,0,0. For example, Range 1 is 255,255,255 through 247,0,0 and Range 2=246,255,255. The remaining ranges are similarly determined.
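As a minimal sketch of Step 1, a color can be assigned to a range by its red component, running from the white end (Range 1) to the black end (Range 32). The function name `color_range` is hypothetical, and the sketch assumes equal bands of eight red values, so that Range 1 ends at 248,0,0; the boundary values quoted in Step 1 differ slightly from an equal partition and are not reproduced here.

```python
def color_range(rgb, num_ranges=32):
    """Map an (R, G, B) triple to a range number, 1 (white end) to num_ranges (black end).

    Assumption: ranges are equal-width bands of the red component, and
    num_ranges divides 256 evenly.
    """
    if len(rgb) != 3 or not all(0 <= c <= 255 for c in rgb):
        raise ValueError("expected three components, each in 0..255")
    red = rgb[0]
    band_width = 256 // num_ranges  # 8 red values per range when num_ranges is 32
    return min((255 - red) // band_width + 1, num_ranges)


white_range = color_range((255, 255, 255))
black_range = color_range((0, 0, 0))
```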

[0091] Step 2: The percentage of color elements used by the reference image in each RGB color space range is generated.

[0092] Step 3: The percentage that each color range is used in the reference image in the RGB color space is generated.
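Steps 2 and 3 might be sketched as follows, reading Step 2 as the share of distinct color values that fall in each range and Step 3 as the share of pixels that fall in each range. That reading, the helper `range_of`, and the function name `distribution_and_usage` are assumptions made for this sketch.

```python
from collections import Counter


def range_of(rgb, num_ranges=32):
    """Hypothetical range assignment: equal bands of the red component."""
    return min((255 - rgb[0]) // (256 // num_ranges) + 1, num_ranges)


def distribution_and_usage(pixels, num_ranges=32):
    """Return (distribution, usage) as percentages keyed by range number.

    distribution: distinct colors in the range / all distinct colors in the image
    usage:        pixels in the range / all pixels in the image
    """
    if not pixels:
        return {}, {}
    distinct_colors = set(pixels)
    colors_per_range = Counter(range_of(c, num_ranges) for c in distinct_colors)
    pixels_per_range = Counter(range_of(p, num_ranges) for p in pixels)
    distribution = {k: 100.0 * n / len(distinct_colors) for k, n in colors_per_range.items()}
    usage = {k: 100.0 * n / len(pixels) for k, n in pixels_per_range.items()}
    return distribution, usage


# Three white pixels and one black pixel.
sample_pixels = [(255, 255, 255)] * 3 + [(0, 0, 0)]
distribution, usage = distribution_and_usage(sample_pixels)
```

In this sample each of the two colors accounts for half of the distinct colors, while white accounts for three quarters of the pixels, which is why the two characteristics can differ for the same image.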

[0093] Step 4: The average distance between the two farthest points of each color in a Color Range is generated for each Color Range.
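A minimal sketch of Step 4 follows. It assumes that the "points" of a color are the pixel positions at which that color appears and that distance is Euclidean distance in pixel coordinates; neither assumption is stated in the text. The names `range_of` and `color_range_distance` are hypothetical, and the brute-force pair search is chosen for clarity rather than speed.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def range_of(rgb, num_ranges=32):
    """Hypothetical range assignment: equal bands of the red component."""
    return min((255 - rgb[0]) // (256 // num_ranges) + 1, num_ranges)


def color_range_distance(image, num_ranges=32):
    """Average, per range, of each color's farthest pixel-to-pixel distance.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    positions = defaultdict(list)
    for y, row in enumerate(image):
        for x, color in enumerate(row):
            positions[color].append((x, y))

    per_range = defaultdict(list)
    for color, points in positions.items():
        # A color that appears only once has no pair of points; use 0.0.
        farthest = max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)
        per_range[range_of(color, num_ranges)].append(farthest)

    return {k: sum(v) / len(v) for k, v in per_range.items()}


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
distances = color_range_distance([[WHITE, BLACK, BLACK, WHITE]])
```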

[0094] Step 5: The image height and width for the specific selected display model are determined.

[0095] II. Image Comparison:

[0096] Step 6: Image Distribution Characteristic (“IDC”)

[0097] a. For each Range defined for the IDC:

[0098] Compare Image1, Ranges 1 through N to Image2, Ranges 1 through N.

[0099] b. Record the differences found in each of the comparisons of step 6a. Ranges not used by either image are ignored.

[0100] c. The image differences results are subtotaled for each range and totaled for all ranges. The total for all ranges is divided by the total number of ranges.

IDC=[((Image1, Range1)÷(Image2, Range1))+((Image1, Range2)÷(Image2, Range2))+ . . . +((Image1, RangeN)÷(Image2, RangeN))]÷N
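The per-range comparison of Step 6 might be sketched as follows. The sketch departs from the printed formula in three respects, each of which is an assumption: it divides the smaller value by the larger so that no term exceeds 1, it divides by the number of ranges actually compared rather than by N, and it scales the result to a percentage. Ranges not used by either image are ignored, as stated in step 6b. The function name `range_match_percent` is hypothetical.

```python
def range_match_percent(image1, image2, num_ranges=32):
    """Average per-range ratio of two characteristics, as a percentage.

    `image1` and `image2` map range numbers (1..num_ranges) to values such as
    the percentages produced during image identification.
    """
    terms = []
    for k in range(1, num_ranges + 1):
        a = image1.get(k, 0.0)
        b = image2.get(k, 0.0)
        if a == 0 and b == 0:
            continue  # range not used by either image is ignored
        # Assumption: smaller over larger, so a one-sided range contributes 0.
        terms.append(min(a, b) / max(a, b))
    if not terms:
        return 0.0  # nothing to compare
    return 100.0 * sum(terms) / len(terms)


idc = range_match_percent({1: 50.0, 2: 50.0}, {1: 50.0, 2: 25.0})
```

The same function could be applied to the Usage values of Step 7, since that step has the same form.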

[0101] Step 7: Image Usage Characteristic (“IUC”)

[0102] a. For each Range defined for the IUC:

[0103] Compare Image1, Ranges 1 through N Usage to Image2, Ranges 1 through N Usage.

[0104] b. Record the differences found in each of the comparisons of step 7a. Ranges not used by either image are ignored.

[0105] c. The image differences results are subtotaled for each range and totaled for all ranges. The total for all ranges is divided by the total number of ranges to derive the individual Usage Characteristic match for the two images.

IUC=[((Image1, Range1 Usage)÷(Image2, Range1 Usage))+((Image1, Range2 Usage)÷(Image2, Range2 Usage))+ . . . +((Image1, RangeN Usage)÷(Image2, RangeN Usage))]÷N

[0106] Step 8: Image Distance Characteristic (“IDIC”)

[0107] a. For each Range defined for the IDIC:

[0108] Compare Image1, Ranges 1 through N Distance to Image2, Ranges 1 through N Distance.

[0109] b. Record the differences found in each of the comparisons of step 8a. Ranges not used by either image are ignored.

[0110] c. Optionally weight distance. For each Range distance, multiply it by the IUC percentage of the corresponding Image1 Range (or by some other selected Usage factor). This procedure weights the distance to the relative use of its Color Range. Thus, when comparing one distance range to another, a distance whose representative color forms a larger percentage of the image will be weighted more heavily than a distance whose representative color forms a smaller percentage of the image.

[0111] d. The image differences results are subtotaled for each range and totaled for all ranges. The total for all ranges is divided by the total number of ranges to derive the individual Distance Characteristic match for the two images.

IDIC=[(((Image1, Range1 Distance)×(Image1, Range1 Usage))÷(Image2, Range1 Usage))+(((Image1, Range2 Distance)×(Image1, Range2 Usage))÷(Image2, Range2 Usage))+ . . . +(((Image1, RangeN Distance)×(Image1, RangeN Usage))÷(Image2, RangeN Usage))]÷N

[0112] Step 9: Image Size Characteristic (“ISC”)

[0113] a. For each Range defined for the ISC:

[0114] Compare Image1, Ranges 1 through N Size to Image2, Ranges 1 through N Size.

[0115] b. Record the differences found in each of the comparisons of step 9a. Ranges not used by either image are ignored.

[0116] c. The image differences results are subtotaled for each range and totaled for all ranges. The total for all ranges is divided by the total number of ranges to derive the individual Size Characteristic match for the two images.

ISC=[((Image1, Range1 Size)÷(Image2, Range1 Size))+((Image1, Range2 Size)÷(Image2, Range2 Size))+ . . . +((Image1, RangeN Size)÷(Image2, RangeN Size))]÷N

[0117] Step 10: Average the four selected group characteristic differences to derive the Percent Probability (“PP”) that Image2 is a copy of reference Image1:

PP=(IDC+IUC+IDIC+ISC)÷4

[0118] Depending upon the probability ranges selected for the comparison process, a PP of 100 can be considered to be a 100% match. PP ranges can be selected from 0% to 100%. Second level or above testing can be performed to confirm an image match. For example, one or both images can be re-analyzed using different characteristics, visually inspected, or analyzed using pattern matching techniques to confirm such a match. Such tests can be computer-implemented, performed by a person, or both.
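Step 10 and the selection of a PP range might be sketched as follows, assuming that each of the four characteristic matches has already been expressed as a percentage. The function names and the default threshold of 80 are illustrative assumptions; the threshold is a selectable criterion rather than a fixed value.

```python
def percent_probability(idc, iuc, idic, isc):
    """Average the four characteristic matches: PP = (IDC + IUC + IDIC + ISC) / 4."""
    return (idc + iuc + idic + isc) / 4


def is_possible_duplicate(pp, selected_threshold=80.0):
    """Apply a selected PP threshold; a match here can go on to second level testing."""
    if not 0.0 <= selected_threshold <= 100.0:
        raise ValueError("selected_threshold must be between 0 and 100")
    return pp >= selected_threshold


pp = percent_probability(100.0, 90.0, 80.0, 70.0)
flagged_at_80 = is_possible_duplicate(pp)
flagged_at_100 = is_possible_duplicate(pp, selected_threshold=100.0)
```

With these sample values the image pair is flagged under an 80% criterion but not under a 100% criterion.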

[0119] Although the present invention has been described with reference to specific exemplary embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention as set forth in the claims. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

[0120] For example, the present invention can be used with data, images, libraries and files stored in any suitable file or data storage program including but not limited to Claris Filemaker, Microsoft's Office and Excel, and the database programs and applications of Lotus, Oracle Corporation, Informix, and Sybase.

What is claimed is:
 1. A method for analyzing image information, comprising the steps of: selecting at least one characteristic of a color space; and determining the selected characteristic for a displayed image; wherein the determined image characteristic can be used to identify the image.
 2. The method of claim 1, further comprising the steps of: determining the image's color distribution; determining the image's color space usage; determining the image's color range distance; and determining the image's image size; wherein the determined color distribution, color space usage, color range distance, and image size can be used to identify the image.
 3. The method of claim 2, wherein the determination of the image's color distribution further comprises the steps of: defining all possible display ranges for a pixel of a selected color space; dividing the ranges into groups of N elements, wherein N is the total number of desired discrete finger print elements; determining the color value for each pixel in the image; and determining the number of expressed color values in each group of N elements for which there is at least one pixel with a corresponding color value.
 4. The method of claim 3, further comprising the step of, for each group of N elements, dividing the number of expressed color values of the group with the total number of color values for all groups.
 5. The method of claim 3, further comprising the step of combining a plurality of color values as a single expressed color value.
 6. The method of claim 5, wherein the plurality of color values are combined by grouping colors that cannot be visually distinguished.
 7. The method of claim 5, wherein, in an RGB color space, the plurality of color values are combined by truncating a specified number of the lower bits representing one or more component colors of a first color value; determining the value of the component colors of the first color value after the truncation; and combining the first color value with any other color values whose color components are equal in value to the value of the component colors of the first color value after the truncation.
 8. The method of claim 2, wherein the determination of the image's color space usage further comprises the step of, for each color range defined in the image's color space, counting the number of pixels that use a color element in the color range.
 9. The method of claim 8, further comprising the step of, for each specific color range of the color space, dividing the total number of pixels using a color element in the color range by the total number of pixels used in the image to generate the percentage of usage for each specific color range of the color space.
 10. The method of claim 2, wherein the determination of the image's color range distance further comprises the steps of: determining the distance between the two farthest points of the color element for each color element of a color range defined in the image's color space; and averaging the total distances for all color elements in the color range.
 11. The method of claim 2, wherein the determination of the image's image size further comprises the step of determining the width and height of the image.
 12. The method of claim 1, further comprising the step of comparing at least one determined image characteristic of a first image with the determined image characteristic of at least a second image.
 13. The method of claim 12, further comprising the step of determining whether the first image is identical to the compared second image.
 14. The method of claim 1, wherein the determination of the selected characteristic for a displayed image is computer-implemented.
 15. A method for comparing a plurality of images, comprising the steps of: selecting a plurality of characteristics of a color space; determining the selected characteristics for a first displayed image; determining the selected characteristics for at least a second displayed image; comparing the determined characteristics of the first and second displayed images; and using a set of predefined criteria to determine whether the determined characteristics of the first displayed image match the determined characteristics of the at least a second displayed image.
 16. The method of claim 15, wherein the set of predetermined criteria includes the percentage of identical characteristics.
 17. A method for identifying a copy of an image, comprising the steps of: identifying a first image; storing identification information for the first image; identifying a second image; and comparing identification information for the second image with the stored identification information for the first image; wherein if a predetermined set of criteria is met, the second image is identical to the first image.
 18. The method of claim 17, wherein the first and second images are compared manually.
 19. The method of claim 17, wherein the first and second images are compared automatically.
 20. The method of claim 17, wherein the first and second images are retrieved from an electronic medium.
 21. The method of claim 20, wherein the first and second images are manually or automatically retrieved from the electronic medium.
 22. The method of claim 21, further comprising the step of using a search engine to locate the first or second image from an electronic network.
 23. The method of claim 21, further comprising the step of using a spider to locate the first or second image from an electronic network.
 24. The method of claim 23, wherein the spider is configured to identify any located images.
 25. An article of manufacture embodying a program of instructions executable by a computer, the program of instructions including instructions for: locating a second image from an electronic network; determining selected characteristics of the second image; comparing the selected characteristics of the second image with selected characteristics of a first image; and determining whether the first and second images are identical based upon the comparison of selected characteristics.
 26. The article of manufacture of claim 25, further comprising means for providing notification of identical first and second images.
 27. The article of manufacture of claim 26, wherein the notification means is an alarm.
 28. The article of manufacture of claim 25, further comprising means for generating a database of identified images.
 29. A computer for communication with an electronic network, the computer comprising: means for locating a second image from an electronic network; means for determining selected characteristics of the second image; means for comparing the selected characteristics of the second image with the selected characteristics of a first image; and means for determining whether the first and second images are identical based upon the comparison of selected characteristics.
 30. The computer of claim 29, wherein the image characteristics are selected from the group consisting of image distribution characteristics, image usage characteristics, image distance characteristics, and image sizing characteristics.
 31. The computer of claim 29, further comprising means for storing identification information of an image.
 32. The computer of claim 29, wherein the means for determining whether the first and second images are identical further comprises means for comparing at least one predetermined criterion.
 33. A method for retrieving image information from an electronic network, comprising the steps of: using a search software application to locate at least a second image from the electronic network; determining selected characteristics of the second image to identify the second image; and comparing the selected characteristics of the second image with selected characteristics of a first image; wherein if a predetermined set of criteria is met, the second image is identical to the first image.
 34. The method of claim 33, wherein the search software application is selected from the group consisting of search engines and spiders.
 35. The method of claim 33, wherein the search software application is adapted to identify any located images.